ARCHER 3.2

Introduction

Project co-ordinators: David Denison and Nuria Yáñez-Bouza (since 2008)
Time of compilation: 2009–2012
Size: ca. 3.3 million words (ca. 2 million in British English, ca. 1.3 million in American English)
Samples: 1,710 files (1,075 British English files, 635 American English files)
Language: English (British and American)
Period: 1600–1999
Version: 2013
Project home page: http://www.manchester.ac.uk/archer/

Aims and scope

The current version of the ARCHER corpus is known as ARCHER 3.2 and was completed in 2013. David Denison and Nuria Yáñez-Bouza have coordinated the work carried out in the international consortium. The coordinators may be contacted via archer@manchester.ac.uk.


The phase of the project leading to ARCHER 3.2 has enhanced the usefulness of the corpus in a number of ways over the triennial project periods 2008–2011 and 2011–2013. It has been a new stage of expansion in terms of diachronic and textual coverage, with the added value of moving to XML-compliant markup and (soon) developing a complete POS-tagged version of the whole corpus with a current tagger and a standard tagset, plus a second version tagged and parsed to different standards. (See further below.) Textual accuracy and consistency in the provision of bibliographic information have also been improved. Further details are given in the section Structure below.


ARCHER 3.2 includes all the files in ARCHER 3.1 plus subsequent materials compiled since 2006. We have also restored the ARCHER 1 and ARCHER 2 files that had been omitted from ARCHER 3.1. Altogether, this new version has improved radically in size (more than doubled), text type coverage (four more genres) and regional coverage (especially the American variety). In toto, ARCHER 3.2 consists of ca. 3.3 million words (3,298,080 words = 1,957,499 British + 1,340,581 American) and 1,710 files (1,075 British + 635 American).
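
The totals quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the figures are copied from the paragraph above, while the variable names and the percentage breakdown are our own additions and are not part of the ARCHER documentation or tooling.

```python
# Figures quoted in the paragraph above; the names are illustrative only.
words = {"British": 1_957_499, "American": 1_340_581}
files = {"British": 1_075, "American": 635}

total_words = sum(words.values())
total_files = sum(files.values())

# The per-variety figures should add up to the stated totals.
assert total_words == 3_298_080
assert total_files == 1_710

# Share of the word count contributed by each variety.
for variety in words:
    share = 100 * words[variety] / total_words
    print(f"{variety}: {words[variety]:,} words in {files[variety]:,} files "
          f"({share:.1f}% of words)")
```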

Note: Work at Manchester was financially supported by the British Academy from 1 June 2010 to 31 December 2011 (extended to 1 June 2012).

Consortium

The current consortium universities and members are listed below in alphabetical order by country, and the individual to contact for details of local access is hyperlinked. For a list of former research members who contributed to 3.2, please see below.


Finland

  • University of Helsinki (Helsingin yliopisto), Department of Modern Languages, English Philology (Nykykielten laitos, Englantilainen filologia), VARIENG research unit: Matti Rissanen, Minna Palander-Collin, Turo Hiltunen.

Germany

  • University of Bamberg (Otto-Friedrich-Universität Bamberg), Department of English (Lehrstuhl für Englische Sprachwissenschaft einschließlich Sprachgeschichte): Manfred Krug.
  • University of Freiburg (Albert-Ludwigs-Universität Freiburg), Department of English (Englisches Seminar): Christian Mair, Bernd Kortmann.
  • University of Heidelberg (Ruprecht-Karls-Universität Heidelberg), Department of English (Anglistisches Seminar): Nadja Nesselhauf.
  • University of Trier (Universität Trier), Department of English Studies (Fachbereich Anglistik): Sebastian Hoffmann.

Spain

  • University of Santiago de Compostela, Research Unit on Variation, Linguistic Change and Grammaticalization, Department of English and German (Departamento de Filoloxía Inglesa e Alemá): Teresa Fanego, María José López-Couso, Belén Méndez-Naya, Paloma Núñez-Pertejo.

Sweden

  • Uppsala University (Uppsala universitet), Department of English (Engelska institutionen): Merja Kytö.

Switzerland

  • University of Zurich (Universität Zürich), Department of English (Englisches Seminar): Gerold Schneider.

UK

  • Lancaster University, UCREL Research Centre: Paul Rayson.
  • University of Leicester, School of Education/School of English: Nick Smith.
  • University of Manchester, Linguistics and English Language: David Denison, Nuria Yáñez-Bouza.

USA

  • University of Michigan, Department of English: Anne Curzan.
  • Northern Arizona University (NAU), Department of English: Douglas Biber.
  • University of Southern California (USC), Department of Linguistics: Edward Finegan.

The ARCHER teams are willing to host visits by interested scholars who wish to consult ARCHER. Please email the project coordinators at archer@manchester.ac.uk for contact details of the responsible person at one of these departments.

Documentation

The following documents have been prepared for the ARCHER 3.2 version; they can be consulted at the ARCHER website maintained at Manchester.


General

  • User agreement form (PDF)
  • Intellectual Property Rights (copyright holders) (PDF)
  • Poster based on 'ARCHER past and present (1990-2011)' presented by Nuria Yáñez-Bouza at ICAME 32, updated to October 2013 (PDF)
  • Using ARCHER online (PDF)

Specific to ARCHER 3.2

  • Number of files and words in the XML and text versions (MS Excel)
  • Style sheet for XML reader (CSS)
  • Perl script by Sebastian Hoffmann used for word count (TXT)
  • Complete word list, with frequencies (TXT)
  • List of special characters and how they are coded (TXT)
  • Comprehensive bibliographic spreadsheet of texts, as used to generate the TEI headers, including word count, previous filenames, compilation notes, etc.
  • Ongoing list of errors for correction in ARCHER 3.3 (MS Word)

We are compiling a list of errors in version 3.2 for correction in ARCHER 3.3. Please report possible errors by email to archer@manchester.ac.uk with the subject header ‘ARCHER text correction’.